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Background of the Invention 
Field of the Invention 

The present invention relates to a voice 
coding/decoding technology based on A-b-s (Analysis- 
by-Synthesis ) vector quantization. 

Description of the Related Art 

The voice coding system represented by the CELP 
(Code Excited Linear Prediction) coding system based 
on the A-b-s vector quantization is applied when the 
transmission rate of a PCM voice signal is compressed 
from, for example, 64 Kbits/sec (kilobits/seconds) to 
approximately 4 through 16 kbits/ sec. The voice 
coding system is demanded as a system for compressing 
information while maintaining voice quality in an in- 
house communications system, a digital mobile radio 
system, etc. 

FIG. 1 shows the conventional A-b-S vector 
quantization system. 51 is a code book, 52 is a gain 
unit, 53 is a linear prediction synthesis filter, 54 
is a subtracter, and 55 is an error power evaluation 
unit . 



In an A-b-S vector quantization coder, the gain 
unit 52 first multiplies the code vector C read from 
the code book 51 by a gain g. Then, the linear 
prediction synthesis filter 53 inputs the above 
described the scaled code vector, and outputs a 
reproduced signal gAC. Then, the subtracter 54 
subtracts the reproduced signal gAC from an input 
signal X, thereby outputting an error signal E which 
indicates the difference between them. Furthermore, 
the error power evaluation unit 55 computes an error 
power according to an error signal E. The above 
described process is performed on all code vectors C 
in the code book 51 with optimal gains g, the index 
of the code vector C and the gain g which generate the 
smallest error power are computed, and they are 
transmitted to a decoder. 

In an A-b-S vector quantization decoder, the code 
vector C corresponding to the index transmitted from 
the coder is read from the code book 51. Then, the 
gain unit 52 scales the code vector C by the gain g 
transmitted from the coder. Then, the linear 
prediction synthesis filter 53 inputs the scaled code 
vector, and outputs the decoded regenerated signal 
gAC. The decoder does not require the subtracter 54 
and the error power evaluation unit 55. 



As described above, in the A-b-S vector 
quantization coder, an analyzing process is performed 
while a synthesizing (decoding) process is performed 
on a code vector C 

FIG. 2 shows a typical conventional CELP system 
based on the above described A-b-S vector quantization 
system . 

In this CELP system, two types of code books, 
that is, an adaptive code book corresponding to a 
periodic (pitch) sound source and a fixed code book 
corresponding to a noisy (random) sound source. 
According to this system, an A-b-S vector quantizing 
process mainly for the periodic voice (voiced sound, 
etc. ) and a succeeding A-b-S vector quantizing process 
mainly for a noisy voice (unvoiced sound, background 
sound, etc. ) are sequentially performed based on 
respective code books. 

In FIG. 2, 61 is a fixed code book, 62 is an 
adaptive code book, 63 and 64 are gain units, 65 and 
66 are linear prediction synthesis filters, 67 and 68 
are error power evaluation units, and 69 and 70 are 
subtracters. Each of the fixed code book 61 
corresponding to a random sound source and the 
adaptive code book 62 corresponding to a pitch sound 
source are contained in the memory. The gain units 63 
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and 64, the linear prediction synthesis filters 65 and 
66, the error power evaluation units 67 and 68, and 
the subtracters 69 and 70 can be realized by operation 
elements such as a DSP ( digital signal processor ) , 
5 etc . 

In the CELP coder with the above described 
configuration, the portion comprising the adaptive 
code book 62, the gain unit 64, the linear prediction 
synthesis filter 66, the subtracter 70, and the error 

10 power evaluation unit 68 outputs a transmission 
parameter effective for periodic voice. P indicates 
an adaptive code vector output from the adaptive code 
book, b indicates a gain in the gain unit 64, and A 
indicates the transmission characteristic of the 

15 linear prediction synthesis filter 66. 

The coding process performed by this portion is 
based on the same principle as the coding process 
performed by the code book 51, the gain unit 52, the 
linear prediction synthesis filter 53, the subtracter 

20 54, and the error power evaluation unit 55. However, 
a sample in the adaptive code book 62 adaptively 
changes by the feedback of a previous excitation 
signal. The decoder performs a process similar to the 
process performed by the decoding process by the code 

25 book 51, the gain unit 52, and the linear prediction 



synthesis filter 53 described above by referring to 
FIG. 1. However, in this case, a sample in the 
adaptive code book 62 also changes adaptively by the 
feedback of a previous excitation signal . 

On the other hand, the portion comprising the 
fixed code book 61, the gain unit 63, the linear 
prediction synthesis filter 65, the subtracter 69, and 
the error power evaluation unit 67 outputs a 
transmission parameter effective for the noisy signal 
X f output by the subtracter 70 subtracting the optimum 
reproduced signal bAP output by the linear prediction 
synthesis filter 66 from the input signal X, The 
coding process by this portion is based oh the same 
principle as the coding process by the code book 51, 
the gain unit 52, the linear prediction synthesis 
filter 53, the subtracter 54, and the error power 
evaluation unit 55. In this case, the fixed code book 
61 preliminarily stores a fixed sample. The decoder 
performs a process similar to the process performed 
by the decoding process by the code book 51, the gain 
unit 52, and the linear prediction synthesis filter 
53 described above by referring to FIG. 1. 

The fixed code book 61 preliminarily stores a 
random code vector C corresponding to a fixed sample 
value. Therefore, for example, assuming that a vector 



dimension length is 40 (corresponding to the number 
of samples in the period of 5 msec (milliseconds) when 
the sampling frequency is 8 kHz), and that the number 
of vector: code book size is 1024, the fixed code book 
61 requires the memory capacity of 40 k (kilo) words. 

That is, a large memory capacity is required by 
the fixed code book 61 to independently store all 
sample values. This is a big problem to be solved 
when the CELP voice codec is realized. 

To solve this problem, an ACELP (Algebraic Code 
Excited Linear Prediction) system has been suggested 
to successfully perform the code book searching 
process in an algebraic method by arranging a small 
number of non-zero sample values at fixed positions 
(refer to J. P. Adoul et al . T Fast CELP coding based 
on algebraic codes 1 Proc. IEEE International 
conference on acoustics speech and signal processing, 
pp. 1957 - 1960 (April, 1987)). 

FIG. 3 shows the configuration of the 
conventional ACELP system using an algebraic code 
book. An algebraic code book 71 corresponds to the 
fixed code book 61 shown in FIG. 2, a gain unit 72 
corresponds to the gain unit 63 shown in FIG. 2, a 
linear prediction synthesis filter 73 corresponds to 
the linear prediction synthesis filter 65 shown in 
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FIG. 2, a subtracter 74 corresponds to the subtracter 
69 shown in FIG. 2, and an error power evaluation unit 
75 corresponds to the error power evaluation unit 67 
shown in FIG. 2. In the A-b-S process shown in FIG. 
5 3, as in the processes described by referring to FIG. 
1 or 2, an A-b-S process is performed using the code 
vector Ci generated from the algebraic code book 71 
corresponding to an index i, and a gain g. 

In this ACELP system, the required amount of 

10 operations and memory can be considerably reduced by 
limiting the amplitude value and position of a non- 
zero sample. At this time, for example, as shown in 
FIG. 4, the N-dimensional M-size algebraic code book 
71 storing code vectors C 0 , C lf is provided. 

15 However, since the number of non-zero samples in a 
frame is fixed and the non-zero samples are arranged 
at equal intervals, each of the code vectors C 0 , C lf 
• • • / ^m-i can be generated in an algebraic method. In 
the example shown in FIG. 4, the sample position of 

20 each of the four non-zero samples i 0 , i x , i 2 , and i 3 is 
standardized, and the amplitude value is ±1.0. The 
amplitude of the sample position other than the four 
sample positions is assumed to be zero. 

As shown on the right of the algebraic code book 

25 71 shown in FIG. 4, the sample value pattern of the 
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code vector corresponding to i 0 , ± lf i 2 , and i 3 depends 
on the sample positions i 0 , ± lf i 2 , and i 3 within the 
amplitude of ±1 excluding the sample position having 
the amplitude of zero, for example, the pattern 
5 corresponding to the code vector C 0 (0, ...0, +1, 0, 
0, -1, 0, ..,0, +1, 0, ...,0, -1, 0, ...). That 
is, for the code vector having, as elements, a total 
of N samples of four non-zero samples and N - 4 zero 
samples, each of the four non-zero samples i n (n = 0, 
iji 10 1, 2, 3) can be expressed by a total of K + 1 bits, 

}y that is, 1 bit for amplitude information (the absolute 

value of the amplitude is fixed to 1, and indicates 
O only the polarity), and K bits for the position 

y information m n specifying one of 2 k candidates. 

15 The position of a non-zero sample is standardized 

by the G.729 or G. 723.1 of the ITU-T (International 
Telecommunication Union-Telecommunication 
Standardization Secter ) . 

For example, in the table 77 shown in FIG. 4 
20 corresponding to the standard G.729, each position 
information m 0 through m 2 about non-zero samples i 0 
through i 2 in 40 samples corresponding to 1 frame has 
candidates at 8 positions. One position can be 
specified by 3 bits. The position information m 3 
25 about a non-zero sample i 3 has candidates at 16 



positions, and can be expressed by 4 bits to specify 
one of the positions* Each piece of the amplitude 
information s 0 through s 3 about the non-zero samples 
i 0 through i 3 can be expressed by 1 bit because the 
absolute value of each amplitude is fixed to 1.0, and 
the polarity is represented. Therefore, in G.729, the 
non-zero samples i 0 through i 3 can be formed by 17 -bit 
data comprising the amplitude information s 0 through 
s 3 each being formed by 1 bit and the position 
information m 0 through m 3 each being formed by 3 or 4 
bits as shown by 76 in FIG. 4. 

In the table 78 shown in FIG. 4 corresponding to 
the standard 723.1, each position candidate of the 
non-zero samples i 0 through i 3 is determined such that 
the position is assigned to every second sample in the 
non-zero samples. Thus, each piece of the position 
information m 0 through m 3 about the non-zero samples 
i 0 through i 3 can be expressed by 3 bits. As in the 
standard G.729, each piece of the amplitude 
information s 0 through s 3 about the non-zero samples 
i 0 through i 3 can be expressed by 1 bit. As described 
above, in G. 723.1, the non-zero samples i 0 through i 3 
can be formed by 16-bit data comprising the amplitude 
information s 0 through s 3 each being formed by 1 bit 
and the position information m 0 through m 3 each being 
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formed by 3 bits as shown by 76 in FIG. 4. 

For example, when the i-th coded word has the 
value s i n/ m i n (where n = 0, 1, 2, 3), the coded word 
sample c* (n) can be defined by the following 
equation. 

c i (n) = s i 0 5 (n-m i 1 ) + s i 1 8 (n-m i 1 ) 

+ s ^6 (n-m i 2 ) + s i 3 &(n-m 1 3 ) (1) 



where s i n indicates the amplitude information 
about a non-zero sample, and m i n indicates the 
position information about a non-zero sample. In 
addition, 6 ( ) indicates a delta function, and the 
following equations exist. 

6 (n) = 1 for n = 0 

6 (n) = 0 for n # 0 

In addition, the error power E 2 can be expressed 
by the following equation using the input signal shown 
in FIG. 3, the gain g, the code vector C ± , and the 
matrix H of the impulse response of the linear 
prediction synthesis filter 73. 

E 2 = (X - gHCJ 2 2 

The evaluation function argmax ( Fi ) for obtaining 
the minimum error power E 2 can be expressed by the 
following equation . 



argmax ( Fi ) = [(X^C*) 2 / {(HC ± ) T (HC ± )}] 3 

where assuming that: 

X T H = D = d(i) 4, and 

H T H = <I> = <{> ( i , j) 5 

the evaluation function argmax (fi) expressed by 
the equation 3 can be expressed by the following 
equation. 

argmax ( Fi ) = [(D T C ± ) 2 / {(C ± ) T SCi}] 6 

where the characters in the upper case indicate 
vectors . 

Since the above described equations 4 and 5 
contain no elements of the code vector C L , an 
arithmetic operation can be preliminarily performed 
even when the number M of patterns (size) of a coded 
word is large. Therefore, a higher-speed operation 
can be performed by the equation 6 than by the 
equation 3 . 

The process relating to the code vector C L is 
performed on four samples having the amplitude of ±1.0 
as described above. Accordingly, the denominator and 
the numerator of the equation 6 can be respectively 
obtained by the following equations 7 and 8 . 



(D T C d ) 2 = {ELoSiCKi^)} 2 



(7) 
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+ 2E^ 0 E% i+1 s i s J .<(>(/n i ^ J -) 




where 2 3 i=0 indicates the accumulation from i = 0 
through i=3. 

The amount of operations by the equations 7 and 
8 does not depend on the parameter (number of 
dimensions) N, and is small. Therefore, even if 
operations are performed the number of times 
corresponding to the number M of coded word patterns, 
the amount of the operations is not large. Therefore, 
with the configuration using the algebraic code book 
71 shown in FIG. 3, the amount of operations can be 
reduced much more than with the configuration using 
the fixed code book 61 shown in FIG. 2. In addition, 
each code vector output from the algebraic code book 
71 can be generated in an algebraic method according 
to the amplitude information (polarity information) 
and the position information. As a result, it is not 
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necessary to store each code vector in the memory, 
thereby considerably reducing the requirements of the 
memory . 

In the above described ACELP system, the 
requirements of the memory and the amount of 
operations can be successfully reduced. However, 
since the number of non-zero samples in a frame is 
fixed to four, and the restrictions are placed such 
that the positions of samples can be set at equal 
intervals, there is the problem that a bit rate 
representing the code vector index is determined 
according to two parameters, that is, the frame length 
parameter and the non-zero sample number parameter, 
thereby requiring a comparatively large number of bits 
to express a code vector index ♦ 

For example, when one frame contains 40 samples 
according to the standard G.729 of the ITU-T, a total 
of 17 bits are used as a code vector index as shown 
in the table 77 shown in FIG, 4. The number of the 
bits corresponds to 42% of the total transmission 
capacity (8 kbits/sec, 80 bits/10 msec) prescribed by 
G.729. 

If one frame contains 80 samples, the number of 
bits required to express the position information 
about a non-zero sample is larger by one than in the 
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above described case. Therefore, a total of 21 bits 
are used as a code vector index. The number of bits 
corresponds to 62.5% of the total transmission 
capacity prescribed by G.729, and is much larger than 
in one frame containing 40 samples. 

Normally, to realize a very low bit rate voice 
CODEC at about 4 kbits/sec, a frame length should be 
extended. However, when the above described 

conventional ACELP system is applied to this 
requirement, there arises the problem of a 
considerable increase of the transmission bit rate of 
a code vector index. That is, the conventional ACELP 
system has the problem that it interrupts a demand to 
lower a bit rate by decreasing the number of parameter 
transmission bits per unit time through higher 
transmission efficiency. 

In addition to this problem, the conventional 
ACELP system also has the problem that the ability to 
identify a pitch period shorter than a frame length 
is lowered when the frame length is extended. 

Summary of the Invention 

The present invention has been developed based 
on the above described background, and aims at setting 
a constant transmission amount of a code vector index 
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and maintaining the identifying ability for a pitch 
period in a voice coding/decoding system based on the 
A-b-S vector quantization using a sound source coded 
word formed only by non-zero amplitude values. 

The present invention relates to a voice coding 
technology based on the analysis-by-synthesis vector 
quantization using a code book in which sound source 
code vector are formed only by non-zero amplitude 
values, and variably controls the sample position of 
a non-zero amplitude value using an index and a 
transmission parameter indicating a feature amount of 
voice. In this case, a lag value corresponding to a 
pitch period can be used as a transmission parameter. 
Furthermore, a pitch gain value can also be used. 
Corresponding to a lag value or a pitch gain value, 
the sample position of a non-zero amplitude value can 
be redesigned within a period corresponding to the lag 
value . 

With the above described configuration, the 
position of a non-zero sample output from a code book 
in the A-b-S vector quantization can be changed and 
controlled using an index and a transmission parameter 
indicating the feature amount of voice such as a lag 
value, a pitch gain, etc. As a result, according to 
the present invention, it is not necessary to increase 



16 

the number of necessary transmission bits even when 
a frame length is extended, thereby successfully 
avoiding the deterioration of the transmission 
efficiency. 

In addition, the present invention has the merit 
that the pitch periodicity can be easily reserved with 
a pitch emphasizing process, etc even in a longer 
frame. 

Brief Description of the Drawings 

Other objects and features of the present 
invention can be easily understood by one of ordinary 
skill in the art from the descriptions of preferred 
embodiments by referring to the attached drawings in 
which: 

FIG. 1 shows the conventional A-b-S vector 
quantization; 

FIG. 2 shows the conventional CELP system; 

FIG. 3 shows the configuration according to the 
conventional ACELP system; 

FIG. 4 shows the outline of the ACELP system; 

FIG. 5 shows the principle of the present 
invention (coding search process); 

FIG. 6 shows the principle of the present 
invention ( regenerating process on the decoding side ) ; 



FIG. 7 shows the first preferred embodiment 
according to the present invention ( coding search 
process ) ; 

FIG. 8 shows the first preferred embodiment 
according to the present invention (regenerating 
process on the decoding side); 

FIG. 9 is a flowchart of the first preferred 
embodiment according to the present invention; 

FIGS. 10A through IOC show the configuration- 
variable code book using a lag value according to the 
preferred embodiment of the present invention; 

FIG. 11 shows the non-zero sample position 
corresponding to a lag value according to the 
preferred embodiment of the present invention; 

FIG. 12 shows the pitch emphasizing process; 

FIG. 13 shows the second preferred embodiment 
according to the present invention ( coding search 
process ) ; 

FIG. 14 shows the second preferred embodiment 
according to the present invention ( regenerating 
process on the decoding side ) ; 

FIG. 15 is a flowchart according to the second 
preferred embodiment of the present invention; and 

FIGS. 16A through 16C show waveform examples of 
each signal. 
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Description of the preferred embodiments 

The preferred embodiment of the present invention 
are described below by referring to the attached 
drawings . 

FIGS. 5 and 6 show the principle of the present 
invention. 1 and 1 T are configuration variable code 
books, 2 and 2 1 are gain units, 3 and 3 f are linear 
prediction synthesis filters, 4 is a subtracter, 5 is 
an error power evaluation unit. 

The configuration variable code books 1 and 1 1 
correspond to an algebraic code book for outputting 
a code vector comprising, for example, a plurality of 
non-zero samples, and has the function of 
reconstructing itself by controlling the position of 
non-zero samples based on an index i and a 
transmission parameter p such as a pitch period ( lag 
value), etc. At this time, the configuration variable 
code books 1 and 1 1 variably control the position of 
non-zero samples without changing the number of non- 
zero samples. Thus, the number of necessary bits for 
transmission of a code vector index can be prevented 
from increasing. 

In the coder with the principle configuration 
according to the present invention shown in FIG. 5, 
after the position of a non-zero sample is controlled 
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according to an index i and a transmission parameter 
p, the gain unit 2 first scales the code vector Ci 
output from the configuration variable code book 1 by 
the gain g. Then, the linear prediction synthesis 
filter 3 inputs the above described scaled code vector 
, and outputs a reproduced signal gACi. Then, the 
subtracter 4 subtracts the above described reproduced 
signal gACi from the input signal X> and outputs the 
difference between them as an error signal E. Next, 
the error power evaluation unit 5 computes error power 
according to an error signal E. The above described 
process is performed on all code vectors Ci output 
from the configuration variable code book 1, and 
plural types of gains g, computes the index i of the 
code vector Ci and the gain g with which the above 
described error power is the smallest, and they are 
transmitted to the decoder. 

In the decoder with the principle configuration 
according to the present invention shown in FIG. 6, 
a parameter separation unit 6 separates each parameter 
from received data transmitted from the coder. Then, 
the configuration variable code book 1 ? outputs a code 
vector Ci according to the index i and the 
transmission parameter p in the above described 
separated parameters. Next, the gain unit 2 1 scales 
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the above described code vector Ci by the gain g 
separated by the parameter separation unit 6. Then, 
the linear prediction synthesis filter 3 ? inputs the 
scaled code vector, and outputs the decoded 
regenerated signal gAC. A linear prediction 

parameter, not shown in FIG. 6, is provided for the 
linear prediction synthesis filter 3 f by the parameter 
separation unit 6 . 

Various transmission parameters p in the 
configuration shown in FIGS, 5 and 6 can be selected 
corresponding to the characteristics of a voice 
signal. For example, a pitch period (lag value), a 
gain, etc. can be adopted. 

FIGS. 7 and 8 shows the first embodiment 
according to the principle configuration shown in 
FIGS. 5 and 6. 11 and 11 1 are configuration variable 
code books, 12 and 12 1 are gain units, 13 and 13 ' are 
linear prediction synthesis filters, 14 is a 
subtracter, 15 is an error power evaluation unit, 16 
is a non-zero sample position control unit, 17 is a 
pitch emphasis filter, and 18 is a parameter 
separation unit. 

As shown at the middle and lower parts in FIG. 
7 (and in FIG. 8), the configuration variable code 
books 11 and ll 1 comprise a non-zero sample position 



control unit 16 for inputting an index i and a pitch 
period (lag value) 1 which is a transmission 
parameter; and a pitch emphasis filter 17 for 
inputting an output signal of the non-zero sample 
position control unit 16 and a pitch period (lag 
value) 1. The non-zero sample position control unit 
16 does not change the number of non-zero samples, but 
variably controls the position of a non-zero sample 
based on the pitch period (lag value) 1. The pitch 
emphasis filter 17 is a feedback filter for 
synthesizing a sample longer than the length 
corresponding to a lag value from a previous lag value 
when the lag value is shorter than the length of a 
frame . 

The function of each unit shown in FIGS. 7 and 
8 can also be realized by operation elements such as 
a DSP (digital signal processor), etc. 

In the conventional ACELP system, non-zero 
samples have been assigned such that they can be 
stored in the entire range of a frame depending on the 
frame length. However, when a lag value corresponding 
to the pitch period is smaller than the length of a 
frame, a sample longer than the length corresponding 
to the lag value can be designed to be synthesized 
from a previous lag value using a feedback filter. 
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In this case, it: is wasteful to assign non-zero 
samples in a range larger than one corresponding to 
the lag value in a frame. 

According to the present embodiment, the non-zero 
5 sample position control unit 16 assigns a non-zero 
sample within a pitch period, that is the range of the 
lag value. Simultaneously, when the lag value exceeds 

gQ the value corresponding to a half of the frame length, 

i .. i 

iji the non-zero sample position control unit 16 removes 

JrjJ 10 some of the non-zero samples, assigned to the last 

1^ half having a smaller influence of the feedback 

process by the pitch emphasis filter 17, in the non- 
jjj zero samples assigned in a pitch periode, and variably 

*; controls the positions of the non-zero samples. Thus, 

0 15 even if the lag value and the frame length change, the 

constant number of non- zero samples can be maintained, 
thereby preventing the number of necessary bits in 
atransmitting code vector index from increasing. 

First, the entire operation of the configuration 
20 according to the first embodiment shown in FIGS. 7 and 
8 is the same as the operation of the principle 
configuration shown in FIGS. 5 and 6. 

FIG. 9 is a flowchart of the operations process 
performed by the non-zero sample position control unit 
25 16 designed in the configuration variable code books 
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11 and ll 1 shown in FIGS. 7 and 8. In the example 
described below, one frame contains 80 samples (8 kHz 
sampling), the number of non-zero samples is 4, the 
lag value equals 20 samples (400 Hz) through 147 
samples (54.4 Hz), and the index transmission bit 
equals 17 bits . 

First, the position of a non-zero sample is 
initialized (step Al in FIG. 9). In this step, non- 
zero sample positions i = 0 through 39 are set at 
equal intervals for the array data smp_pos [i] (0 < 
i < 40 ) containing 40 elements. 

Then, a lag value corresponding to an input pitch 
period is determined. The lag value is not shown in 
FIG. 7 or 8, but can be computed in the A-b-S process 
(corresponding to the configuration at the upper part 
of FIG. 2), to be performed before the ACELP process, 
using an adaptive code book. 

First, it is determined whether or not the lag 
value is smaller than the first set value of 40 ( step 
A2 in FIG. 9). If the determination is YES, then the 
process in step A6 shown in FIG. 9 is performed, and 
each non-zero sample position is entered. 

As a result, when the lag value corresponding to 
the pitch period is equal to or smaller than 40, then 
the position of a non-zero sample is determined as 
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shown in FIG. 10A. The arrangement is the same as 
that shown on table 77 in FIG. 4 corresponding to the 
above described ITU-T standard G.729. 

On the other hand, when the determination in step 
A2 shown in FIG. 9 is NO, it is determined whether or 
not the second set value of lag value is equal to or 
larger than 80 (step A3 in FIG. 9). If the 
determination is NO, the contents of the array data 
smp_pos [] are sequentially changed in the for loop 
process in the process of controlling the position of 
a non-zero sample in step A5 shown in FIG. 9. Then, 
using the changed array data, the process of entering 
the position of the non-zero sample in step A6 is 
performed. 

As a result, when the lag value corresponding to 
a pitch period is larger than 40 and smaller than 80, 
for example, when it is 45, the position of a non-zero 
sample is determined as shown in FIG. 10B. As shown 
in FIG. 11, the arrangement is obtained by adding the 
sample positions 40, 42, and 44 replacing the sample 
positions 35, 37, and 39 in the arrangement shown in 
the table in FIG. 10A. 

Practically, if the lag value is, for example, 
45, i = 0, ix = 40, and iy = 0 as initial values, and 
(lag -41)/2+l=3, then three sample positions 



are position-controlled . That is, the operation of 
smp_pos [39 - iy] = ix is performed using ix = 40 and 
iy = 0. In the sample position data smp_pos [39], the 
sample position 40 replaces the sample position 39. 
Then, ix = 42 and iy = 2 are obtained using ix + = 2 
and iy + = 2, the sample position 42 replaces the 
sample position 37 in the sample position data smp_pos 
[37] . Furthermore, using the values ix = 44 and iy 
= 4, the sample position 44 replaces the sample 
position 35 in the sample position data smp_pos [35] . 

As described above, when the lag value 
corresponding to the pitch period is larger than 40 
and smaller than 80 according to the present 
embodiment, the sample positions are removed by the 
number of samples corresponding to the increase from 
the lag value of 40 so that the positions are 
reconstructed within the range of the lag value, 
thereby reconstructing the positions without changing 
the number of non-zero samples. 

When the determination in step A3 shown in FIG. 
9 is YES, the clipping process in step A4 shown in 
FIG. 9 is performed. That is, when the lag value 
exceeds 80 corresponding to the frame length, it is 
insignificant to assign a non-zero sample outside the 
range of the frame length. Therefore, when the lag 
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value is clipped at: 80 , the process of controlling the 
positions of non-zero samples in step A5 shown in FIG. 
9, and the subsequent process of entering the 
positions of non-zero samples in step A6 are 
5 performed. As a result, the positions of non-zero 
samples are determined as shown in FIG. 10C. 

In the above described control process, the 
ifj positions of non-zero samples are reconstructed 

IS corresponding to the lag value even when the lag value 

•- 3 10 increases. Therefore, it is possible to maintain the 

!U number of bits of 17 to be transmitted for a code 

vector index without changing the number of non-zero 
|5 samples. 

^ FIG. 12 shows the pitch emphasis process 

hQ 15 performed by the pitch emphasis filter 17 forming 

*" parts of the configuration variable code books 11 and 

ll 1 shown in FIGS. 7 and 8. 31 and 34 are coefficient 
units, 32 is an adder, and 33 is a delay circuit. 

In FIG. 12, the transmission function of the 
20 configuration including the coefficient units 31 and 
34, the adder 32, and the delay circuit 33 can be 
expressed by P(z) = a / (1 - pz~ lag ). a is the 
coefficient of the coefficient unit 31, p is the 
coefficient of the coefficient unit 34, lag indicates 
25 a lag value. For example, the coefficient a of the 
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coefficient unit: 31 is a = 1. 0 in *the range of 0 
through (lag - 1), and a = 0. 0 in the range of lag 
through 79. The coefficient p of the coefficient unit 
34 is 1. 0. The coefficients a and p are not limited 
to these values, but can be set to other values. 

With the above described circuit configuration, 
when the lag value is smaller than the frame length, 
a sample having the length larger than the value 
corresponding to the lag value in the frame is fed 
back from the previous lag value and synthesized. As 
a result, a sequence can be generated in 
synchronization with the pitch period, while 
maintaining the representability of pitch periodicity. 

FIGS. 13 and 14 show the second embodiment of the 
present invention based on the principle configuration 
shown in FIGS. 5 and 6. 21 and 21 1 are configuration 
variable code books, 22 and 22 1 are gain units, 23 and 
23' are linear prediction synthesis filter 23, 24 is 
a subtracter, 25 is an error power evaluation unit, 
26 is a non-zero sample position control unit, 27 is 
a pitch synchronization filter, and 28 is a parameter 
separation unit. 

The entire operation of the configuration 
according to the second embodiment shown in FIGS. 13 
and 14 is the same as the operation according to the 
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principle configuration described by referring to 
FIGS. 5 and 6. 

The configuration variable code books 21 and 21 1 
comprise the non-zero sample position control unit 26 
and the pitch synchronization filter 27 as with the 
configuration variable code books 11 and 11* (shown 
in FIGS. 7 and 8) corresponding to the first 
embodiment of the present invention. The 
configuration according to the second embodiment is 
different from the first embodiment in that the non- 
zero sample position control unit 26 and the pitch 
synchronization filter 27 input a pitch gain G in 
addition to the lag value 1 corresponding to the pitch 
period as a transmission parameter. 

As a lag value corresponding to the pitch period 
computed in the A-b-S process (corresponding to the 
upper half of the configuration shown in FIG. 2) using 
an adaptive code book, the most probable value in the 
search range is selected even when input voice has no 
definite pitch period. Therefore, in the region of 
an unvoiced sound or a background sound for which a 
noisy sound source is appropriate, a pseudo-pitch 
period is extracted, and the information about the 
pitch period is transmitted from the coder to the 
decoder. In this case, a big pitch gain G indicates 
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a storong pitch periodicity, and a small pitch gain 
G indicates a weak pitch periodicity such as an 
unvoiced sound, a background sound, etc. According 
to the second embodiment of the present invention, a 
5 pitch gain G is adopted as one of the transmission 
parameters . 

FIG. 15 is a flowchart of the operating process 
performed by the non-zero sample position control unit 
26 in the configuration variable code books 21 and 21 1 

10 shown in FIGS. 13 and 14. In this flowchart, the 
control processes in steps Bl, B3, B4, B7, B5, and B6 
are the same as the processes in steps Al, A2, A3, A4, 
A5, and A6 in the flowchart shown in FIG. 9 
corresponding to the first embodiment of the present 

15 invention. 

The second embodiment is different from the first 
embodiment in the process performed when the pitch 
gain G is smaller than a predetermined threshold. 
That is, in step B2 shown in FIG. 15, it is determined 

20 whether or not the pitch gain G is smaller than the 
threshold. If the determination is YES, then the 
setting of a pitch period is insignificant, and 
therefore, the lag value is clipped at 80, which 
equals the frame length, and the same process as in 

25 the first embodiment is performed. 
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In the above described control process, the 
characteristics of the present embodiment can be 
furthermore improved . 

FIGS. 16A through 16C show input voice X 
(corresponding to the X shown in FIGS. 16A and 2), 
noisy input signal X 1 (corresponding to the X' shown 
in FIGS. 16B, 5, etc. ) to the present embodiment, and 
an example of each waveform (FIG. 16C) from the 
configuration variable code book (1 shown in FIG. 5, 
etc. ) of the present invention. 

The embodiments of the present invention are 
described above, but the present invention is not 
limited only to the described embodiments, but 
additions and amendments can be made to them. For 
example, the frame length, the number of samples, etc. 
can be optionally selected corresponding to an 
applicable system. In addition, a transmission 
parameter corresponding to, for example, the format 
of a vowel can be used. Furthermore, the present 
invention can be applied not only to the ACELP system, 
but also to a voice coding system in which a plurality 
of non-zero samples are used and the positions of the 
non-zero samples are controlled using a transmission 
parameter . 



